Perceptual Image Similarity Metrics and Applications.
This dissertation presents research in perceptual image similarity metrics and applications, e.g., content-based image retrieval, perceptual image compression, image similarity assessment and texture analysis.
The first part aims to design texture similarity metrics consistent with human perception. A new family of statistical texture similarity features, called Local Radius Index (LRI), and corresponding similarity metrics are proposed. Compared to state-of-the-art metrics in the STSIM family, LRI-based metrics achieve better texture retrieval performance with much less computation. When applied to the recently developed perceptual image coder, Matched Texture Coding (MTC), they enable similar performance while significantly accelerating encoding. Additionally, in photographic paper classification, LRI-based metrics also outperform pre-existing metrics. To fulfill the needs of texture classification and other applications, a rotation-invariant version of LRI, called Rotation-Invariant Local Radius Index (RI-LRI), is proposed. RI-LRI is also insensitive to grayscale and illumination changes. The corresponding similarity metric achieves texture classification accuracy comparable to state-of-the-art metrics. Moreover, its much lower-dimensional feature vector requires substantially less computation and storage than other state-of-the-art texture features.
The second part of the dissertation focuses on bilevel images, which are images whose pixels are either black or white. The contributions include new objective similarity metrics intended to quantify similarity consistent with human perception, and a subjective experiment to obtain ground truth for judging the performance of objective metrics. Several similarity metrics are proposed that outperform existing ones in the sense of attaining significantly higher Pearson and Spearman-rank correlations with the ground truth. The new metrics include Adjusted Percentage Error, Bilevel Gradient Histogram, Connected Components Comparison, as well as combinations thereof.
Another portion of the dissertation focuses on the aforementioned MTC, which is a block-based image coder that uses texture similarity metrics to decide whether blocks of the image can be encoded by pointing to perceptually similar blocks in the already coded region. The key to its success is an effective texture similarity metric, such as an LRI-based metric, together with an effective search strategy. Compared to traditional image compression algorithms, e.g., JPEG, MTC achieves a similar coding rate with higher reconstruction quality, and its advantage grows as the coding rate decreases.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/113586/1/yhzhai_1.pd
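To make the block-pointing idea behind MTC concrete, here is a minimal toy sketch, not the actual coder: the block size, the acceptance threshold, and the use of plain MSE as a stand-in for an LRI-based texture similarity metric are all illustrative assumptions.

```python
import numpy as np

def toy_block_coder(image, block=8, threshold=50.0):
    """Toy illustration of MTC-style coding: each block is either copied
    (by pointing at a perceptually similar, already-coded block) or sent raw.
    MSE stands in here for a real texture similarity metric such as LRI."""
    h, w = image.shape
    coded_blocks = []   # blocks available for pointing (already coded region)
    stream = []         # the "bitstream": ("copy", index) or ("raw", block)
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            b = image[y:y + block, x:x + block].astype(float)
            # search the already-coded region for a similar enough block
            best_idx, best_d = None, threshold
            for i, c in enumerate(coded_blocks):
                d = np.mean((b - c) ** 2)
                if d < best_d:
                    best_idx, best_d = i, d
            if best_idx is not None:
                stream.append(("copy", best_idx))
                coded_blocks.append(coded_blocks[best_idx])
            else:
                stream.append(("raw", b))
                coded_blocks.append(b)
    return stream
```

Since a pointer costs far fewer bits than a raw block, the rate drops as more blocks are judged similar, which is why the quality of the similarity metric and of the search strategy dominates the coder's performance.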
OBJECTIVE SIMILARITY METRICS FOR SCENIC BILEVEL IMAGES
This paper proposes new objective similarity metrics for scenic bilevel images, which are images containing natural scenes such as landscapes and portraits. Though percentage
error is the most commonly used similarity metric for bilevel images, it is not always consistent with human perception. Based on hypotheses about human perception of bilevel images, this paper proposes new metrics that outperform percentage error in the sense of attaining significantly higher Pearson and Spearman-rank correlation coefficients with respect to subjective ratings. The new metrics include Adjusted Percentage Error, Bilevel Gradient Histogram and Connected Components Comparison. The subjective ratings come from similarity evaluations described in a companion paper. Combinations
of these metrics are also proposed, which exploit their complementarity to attain even better performance.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/111058/4/OBJECTIVE SIMILARITY METRICS FOR SCENIC BILEVEL IMAGES.pd
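For reference, plain percentage error for bilevel images is just the fraction of pixels that differ. The sketch below implements it, together with a deliberately crude connected-components comparison; the latter is only a stand-in for the paper's Connected Components Comparison metric, whose actual definition is more elaborate.

```python
import numpy as np
from scipy import ndimage

def percentage_error(a, b):
    """Plain percentage error: the fraction of pixels that differ
    between two bilevel images of the same size."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return np.mean(a != b)

def component_count_difference(a, b):
    """Crude stand-in for a connected-components comparison (assumed, not
    the paper's metric): compare how many 8-connected foreground
    components each image contains."""
    eight = np.ones((3, 3), dtype=int)  # 8-connectivity structuring element
    _, na = ndimage.label(np.asarray(a, dtype=bool), structure=eight)
    _, nb = ndimage.label(np.asarray(b, dtype=bool), structure=eight)
    return abs(na - nb)
```

The weakness of percentage error that motivates the new metrics is visible even in this sketch: flipping a few scattered pixels and erasing one whole small object can yield the same percentage error while looking very different to a human viewer.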
Similarity of Scenic Bilevel Images
This paper was submitted to the IEEE Transactions on Image Processing in May 2015.
This paper presents a study of bilevel image similarity, including new objective metrics intended to quantify similarity consistent with human perception, and a subjective experiment to obtain ground truth for judging the performance of the objective similarity metrics. The focus is on scenic bilevel images, which are complex, natural or hand-drawn images, such as landscapes or portraits.
The ground truth was obtained from ratings by 77 subjects of 44 distorted versions of seven scenic images, using a modified version of the SDSCE testing methodology.
Based on hypotheses about human perception of bilevel images, several new metrics are proposed that outperform existing ones in the sense of attaining significantly higher Pearson and Spearman-rank correlation coefficients with respect to the ground truth from the subjective experiment. The new metrics include Adjusted Percentage Error, Bilevel Gradient Histogram and Connected Components Comparison. Combinations of these metrics are also proposed, which exploit their complementarity to attain even better performance.
These metrics and the ground truth are then used to assess the relative severity of various kinds of distortion and the performance of several lossy bilevel compression methods.
http://deepblue.lib.umich.edu/bitstream/2027.42/111737/2/Similarity of Scenic Bilevel Images.pdf
Description of Similarity of Scenic Bilevel Images.pdf : Main article ("Similarity of Scenic Bilevel Images")
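The evaluation protocol these papers share, ranking metrics by Pearson and Spearman-rank correlation against subjective ratings, can be sketched in a few lines; the toy numbers in the usage test are illustrative, not data from the study.

```python
from scipy import stats

def metric_agreement(metric_scores, subjective_ratings):
    """Pearson and Spearman-rank correlation coefficients between a
    metric's scores and subjective ground-truth ratings; a metric with
    correlations closer to 1 tracks human perception more closely."""
    pearson, _ = stats.pearsonr(metric_scores, subjective_ratings)
    spearman, _ = stats.spearmanr(metric_scores, subjective_ratings)
    return pearson, spearman
```

Reporting both coefficients matters: Spearman rewards any monotone relationship between metric and ratings, while Pearson additionally penalizes nonlinearity, so a metric can rank images perfectly yet still score below 1 on Pearson.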
Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning
As advanced image manipulation techniques emerge, detecting the manipulation
becomes increasingly important. Despite the success of recent learning-based
approaches for image manipulation detection, they typically require expensive
pixel-level annotations to train, while exhibiting degraded performance when
testing on images that are differently manipulated compared with training
images. To address these limitations, we propose weakly-supervised image
manipulation detection, such that only binary image-level labels (authentic or
tampered with) are required for training purposes. Such a weakly-supervised
setting can leverage more training images and has the potential to adapt
quickly to new manipulation techniques. To improve the generalization ability,
we propose weakly-supervised self-consistency learning (WSCL) to leverage the
weakly annotated images. Specifically, two consistency properties are learned:
multi-source consistency (MSC) and inter-patch consistency (IPC). MSC exploits
different content-agnostic information and enables cross-source learning via an
online pseudo label generation and refinement process. IPC performs global
pair-wise patch-patch relationship reasoning to discover a complete region of
manipulation. Extensive experiments validate that our WSCL, even though it is
weakly supervised, exhibits competitive performance compared with its
fully-supervised counterparts under both in-distribution and out-of-distribution
evaluations, as well as reasonable manipulation localization ability.
Comment: Accepted to ICCV 2023, code: https://github.com/yhZhai/WSC
SOAR: Scene-debiasing Open-set Action Recognition
Deep learning models have a risk of utilizing spurious clues to make
predictions, such as recognizing actions based on the background scene. This
issue can severely degrade the open-set action recognition performance when the
testing samples have different scene distributions from the training samples.
To mitigate this problem, we propose a novel method, called Scene-debiasing
Open-set Action Recognition (SOAR), which features an adversarial scene
reconstruction module and an adaptive adversarial scene classification module.
The former prevents the decoder from reconstructing the video background given
video features, and thus helps reduce the background information in feature
learning. The latter aims to confuse scene type classification given video
features, with a specific emphasis on the action foreground, and helps to learn
scene-invariant information. In addition, we design an experiment to quantify
the scene bias. The results indicate that the current open-set action
recognizers are biased toward the scene, and our proposed SOAR method better
mitigates such bias. Furthermore, our extensive experiments demonstrate that
our method outperforms state-of-the-art methods, and the ablation studies
confirm the effectiveness of our proposed modules.
Comment: Accepted to ICCV 2023, code: https://github.com/yhZhai/SOA
Scenic bilevel image similarity metrics MATLAB code
This item contains MATLAB code for the scenic bilevel image similarity metrics described in the following two papers: (1) Y. Zhai and D.L. Neuhoff, Similarity of Scenic Bilevel Images, to appear in IEEE Transactions on Image Processing, 2016.
(2) Y. Zhai, D.L. Neuhoff and T.N. Pappas, Objective Similarity Metrics for Scenic Bilevel Images, IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 2793-2797, Florence, Italy, May 2014.
http://deepblue.lib.umich.edu/bitstream/2027.42/122736/1/Scenic bilevel image similarity metrics MATLAB code.zip
Description of Scenic bilevel image similarity metrics MATLAB code.zip : MATLAB code
SUBJECTIVE SIMILARITY EVALUATION FOR SCENIC BILEVEL IMAGES
In order to provide ground truth for subjectively comparing compression methods for scenic bilevel images, as well as for judging objective similarity metrics, this paper describes the subjective similarity rating of a collection of distorted scenic bilevel images. Unlike text, line drawings, and silhouettes, scenic bilevel images contain natural scenes, e.g., landscapes and portraits. Seven scenic images were each distorted in forty-four ways, including random bit flipping, dilation, erosion and lossy compression. To produce subjective similarity ratings, the distorted images were each viewed by 77 subjects. The ratings are then used to compare the performance of four compression algorithms and to assess how well percentage error and SmSIM work as bilevel image similarity metrics. These subjective ratings can also provide ground truth for future tests of objective bilevel image similarity metrics.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/111057/4/SUBJECTIVE SIMILARITY EVALUATION FOR SCENIC BILEVEL IMAGES.pd
Bilevel Image Similarity Ground Truth Archive
The data in this file is intended for scholarly, non-commercial use only. The images cannot be re-distributed. Copyright to the data is retained by Yuanhao Zhai and David L. Neuhoff. If issues arise, please contact [email protected] or [email protected].
This archive contains a set of seven bilevel images, the same images distorted in a number of ways and to a number of different degrees, subjective rating scores for each distorted image as to its similarity to its corresponding original, and the amounts of time required for the rating of each image.
http://deepblue.lib.umich.edu/bitstream/2027.42/111059/3/Bilevel Image Similarity Ground Truth Archive.zi